Supervised Classification of Healthcare Text Data Based on Context-Defined Categories
Abstract
Achieving a good success rate in supervised classification analysis of a text dataset, where the relationship between the text and its label can be extracted from the context, but not from isolated words in the text, is still an important challenge facing the fields of statistics and machine learning. For this purpose, we present a novel mathematical framework. We then conduct a comparative study of established classification methods for the case where the relationship between the text and the corresponding label is clearly depicted by specific words in the text. In particular, we use logistic LASSO, artificial neural networks, support vector machines, and decision-tree-like procedures. This methodology is applied to a real case study involving mapping Consolidated Framework for Implementation Research (CFIR) constructs to health-related text data and achieves a prediction success rate of over 80% when just the first 55% of the text, or more, is used for training and the remaining text for testing. The results indicate that the methodology can be useful to accelerate the CFIR coding process.
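The evaluation protocol mentioned in the abstract (train on the first portion of the text, test on the remainder, report the share of correct predictions) might be sketched as follows. This is only an illustration under stated assumptions: the function names `sequential_split` and `success_rate` are hypothetical, the unit of text (sentence, passage, or other) is not specified in the abstract, and the cut point is rounded down here by choice.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sequential_split(
    units: Sequence[T], train_fraction: float = 0.55
) -> Tuple[List[T], List[T]]:
    """Split ordered text units into a leading training part and a trailing test part.

    Assumption: the order of `units` matches the order of the text, so the
    first `train_fraction` of the units is used for training (no shuffling).
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(units) * train_fraction)  # rounded down (an assumption)
    return list(units[:cut]), list(units[cut:])


def success_rate(true_labels: Sequence[str], predicted_labels: Sequence[str]) -> float:
    """Fraction of test units whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one label is required")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


if __name__ == "__main__":
    # Hypothetical labelled units; real data would pair text passages
    # with CFIR construct labels.
    units = [f"passage {i}" for i in range(20)]
    train_units, test_units = sequential_split(units, train_fraction=0.55)
    print(len(train_units), len(test_units))
```

Any of the classifiers named in the abstract could be fitted on the training part and scored with `success_rate` on the test part; the choice of classifier is independent of this split.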
Similar resources
OmniCat: Automatic Text Classification with Dynamically Defined Categories
We present OmniCat, an ontology-based text categorization method that classifies documents into a dynamically defined set of categories specified as contexts in the domain ontology. The method does not require a training set and is based on measuring the semantic similarity of the thematic graph created from a text document and the ontology fragments created by the projection of the defined con...
Text Classification Based on Manifold Semi-Supervised Support Vector Machine
This article presents a solution, along with experimental results, for applying semi-supervised machine learning techniques and an improvement of the SVM (Support Vector Machine) based on a geodesic model to build text classification applications for the Vietnamese language. The objective here is to improve the semi-supervised machine learning by replacing the kernel function of SVM using geodesi...
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
A data stream is a sequence of data generated from various information sources at high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, the stream is commonly divided into fixed-size windows or gradual forgetting is used. Concept drift refer...
Text Summarization Based on Conceptual Data Classification
In this paper, we present an original approach for text summarization using conceptual data classification. We show how a given text can be summarized without losing meaningful knowledge and without using any semantic or grammatical concepts. In fact, conceptual data classification is used to extract the most interacting sentences from the main text and ignore the other meaningless sentences in ...
Semi-supervised Collaborative Text Classification
Most text categorization methods require the text content of documents, which is often difficult to obtain. We consider “Collaborative Text Categorization”, where each document is represented by the feedback from a large number of users. Our study focuses on the semi-supervised case, in which one key challenge is that a significant number of users have not rated any labeled document. To address this pr...
Journal
Journal title: Mathematics
Year: 2022
ISSN: 2227-7390
DOI: https://doi.org/10.3390/math10122005